Integrating the energy information into MFCC
Authors
Abstract
The Mel-Frequency Cepstrum Coefficients (MFCC), introduced in 1980 by Davis and Mermelstein [2], are a widely used feature set in automatic speech recognition systems. In the traditional implementation, the 0th coefficient is excluded because it is considered somewhat unreliable. In this paper, we analyze this term and find that it can be regarded as a generalized frequency band energy (FBE) and is hence useful; including it yields the FBE-MFCC. We also propose an alternative analysis of the frame energy, called auto-regressive analysis, which performs better than the first- and/or second-order differential derivatives of the frame energy. Experiments show that the FBE-MFCC and the frame energy, together with their corresponding auto-regressive analysis coefficients, form the better combination, reducing the syllable error rate (SER) by 10.0% across a very large speech database compared to the traditional MFCC with its corresponding auto-regressive analysis coefficients.
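The claim that the 0th coefficient behaves like a band-energy term can be illustrated with the textbook cepstrum computation, in which the coefficients are a DCT-II of the log filter-bank energies. The sketch below is not the authors' FBE-MFCC definition, which the abstract does not give; it assumes an unnormalized DCT-II, and the band energies and function names are hypothetical values chosen for illustration.

```python
import math


def dct_ii(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi * k * (n + 0.5) / N)."""
    n_bands = len(x)
    return [
        sum(x[n] * math.cos(math.pi * k * (n + 0.5) / n_bands)
            for n in range(n_bands))
        for k in range(n_bands)
    ]


def cepstrum_from_band_energies(band_energies):
    """Cepstral coefficients as the DCT-II of the log band energies."""
    log_energies = [math.log(e) for e in band_energies]
    return dct_ii(log_energies)


# Hypothetical mel filter-bank energies for one frame (illustrative only).
band_energies = [0.8, 1.5, 2.2, 1.1, 0.6, 0.3]
cepstrum = cepstrum_from_band_energies(band_energies)

# For k = 0 every cosine equals 1, so c0 is the sum of the log band energies.
c0 = cepstrum[0]
log_energy_sum = sum(math.log(e) for e in band_energies)

# Scaling every band by the same gain shifts only c0 (by N * log(gain));
# the higher coefficients are unchanged because their cosines sum to zero.
gain = 4.0
scaled_cepstrum = cepstrum_from_band_energies([gain * e for e in band_energies])
c0_shift = scaled_cepstrum[0] - c0
```

Under these assumptions, `c0` equals `log_energy_sum`, and a uniform gain changes `c0` while leaving the remaining coefficients untouched, which is consistent with reading the 0th term as an energy-like quantity across the frequency bands.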
Similar resources
Integrating information of the efficient and anti-efficient frontiers in DEA analysis to assess location of solar plants: A case study in Iran
Solar photovoltaic (PV) energy is one of the most promising sources of energy and has attracted much interest. It is potentially the largest source of energy in the world and is capable of mitigating greenhouse gas (GHG) emissions significantly in comparison with fossil fuels. Location optimization of solar plants can play a vital role in raising the efficiency and performance of the solar PV s...
Integrating Complementary Features from Vocal Source and Vocal Tract for Speaker Identification
This paper describes a speaker identification system that uses complementary acoustic features derived from the vocal source excitation and the vocal tract system. Conventional speaker recognition systems typically adopt the cepstral coefficients, e.g., Mel-frequency cepstral coefficients (MFCC) and linear predictive cepstral coefficients (LPCC), as the representative features. The cepstral fea...
High Improvement of Speaker Identification and Verification by Combining Mfcc and Phase Information
In conventional speaker recognition methods based on MFCC, phase information has been ignored. We proposed a method that integrated the phase information with MFCC in a speaker identification method, and a preliminary experiment was performed. In this paper, we propose a new modified feature parameter (that is, coordinates on a unit circle) obtained from the original phase information, and eva...
Improved phoneme recognition by integrating evidence from spectro-temporal and cepstral features
Gabor features have been proposed for extracting spectro-temporal modulation information and have yielded significant improvements in recognition performance. In this paper, we propose the integration of Gabor posteriors with MFCC posteriors, yielding a relative improvement of 14.3% over an MFCC Tandem system. We analyze, for different types of acoustic units, the complementarity between Gabor featu...
Integrating Complementary Features with a Confidence Measure for Speaker Identification
This paper investigates the effectiveness of integrating complementary acoustic features for improved speaker identification performance. The complementary contributions of two acoustic features, i.e. the conventional vocal tract related features MFCC and the recently proposed vocal source related features WOCOR, for speaker identification are studied. An integrating system, which performs a sc...